Efficient Filtering of XML Documents for Selective Dissemination of Information

نویسندگان

  • Mehmet Altinel
  • Michael J. Franklin
چکیده

Information Dissemination applications are gaining increasing popularity due to dramatic improvements in communications bandwidth and ubiquity. The sheer volume of data available necessitates the use of selective approaches to dissemination in order to avoid overwhelming users with unnecessaryinformation. Existing mechanisms for selective dissemination typically rely on simple keyword matching or “bag of words” information retrieval techniques. The advent of XML as a standard for information exchangeand the development of query languages for XML data enables the development of more sophisticated filtering mechanisms that take structure information into account. We have developed several index organizations and search algorithms for performing efficient filtering of XML documents for large-scale information dissemination systems. In this paper we describe these techniques and examine their performance across a range of document, workload, and scale scenarios.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Approach to Filtering of XML Streaming Data

Information processing and retrieval in many applications needs filtering of the XML streams. A streamfilter system examines queries on a continuous stream of XML documents and delivers matched content to the user. This paper proposes a new algorithm named PFilter for stream filtering systems. The PFilter processes a large amount of XPath query expressions to provide the desired XML nodes. PFil...

متن کامل

XPath-Aware Chunking of XML-Documents

Dissemination systems are used to route information received from many publishers individually to multiple subscribers. The core of a dissemination system consists of an efficient filtering engine deciding what part of an incoming message goes to which recipient. Within this paper we are proposing a chunking framework of XML documents to speed up the filtering process for a set of registered su...

متن کامل

Structural Similarity Evaluation Between XML Documents and DTDs

The automatic processing and management of XML-based data are ever more popular research issues due to the increasing abundant use of XML, especially on the Web. Nonetheless, several operations based on the structure of XML data have not yet received strong attention. Among these is the process of matching XML documents and XML grammars, useful in various applications such as documents classifi...

متن کامل

خوشه‌بندی فراابتکاری اسناد فارسی اِکس‌اِم‌اِل مبتنی بر شباهت ساختاری و محتوایی

Due to the increasing number of documents, XML, effectively organize these documents in order to retrieve useful information from them is essential. A possible solution is performed on the clustering of XML documents in order to discover knowledge. Clustering XML documents is a key issue of how to measure the similarity between XML documents. Conventional clustering of text documents using a do...

متن کامل

Content-Driven Secure and Selective XML Dissemination

Collaborating on complex XML data structures is a non-trivial task in domains such as the public sector, healthcare or engineering. Specifically, providing scalable XML content dissemination services in a selective and secure fashion is a challenging task. This paper proposes a publish/subscribe infrastructure to disseminate enterprise XML content utilizing document semantics. Our approach reli...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000